METHOD AND APPARATUS FOR REDUCING CONFLICTS
BETWEEN SPEECH-ENABLED APPLICATIONS SHARING SPEECH MENU

FIELD OF INVENTION 

The present invention relates to a method and apparatus
for reducing conflicts between speech-enabled applications
sharing a speech menu.

BACKGROUND OF THE INVENTION 

In conventional systems, a number of electronic devices
can be controlled using a Speech-Enabled Application ("SEA")
which is executed using a computer. In addition, a plurality
of SEAs may exist in a particular electronic device (e.g., a
Consumer Electronic ("CE") device such as a stereo system and a
television set). Each command of the electronic device has a
corresponding plurality of sound commands grouped together in
a speech menu. A user, after activating a particular SEA
device, issues a sound command (i.e., a word, phrase or tone).
The SEA matches the sound command to a corresponding execution
command. Such matching is performed using tables or databases
of SEA where the sound command and the execution command are
stored. Then, the execution command is sent to a processor of
the electronic device for execution.
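
The table-based matching described above can be pictured with a
minimal sketch. The table contents, the execution command names
and the function match_sound_command are hypothetical and are
used only for illustration; an actual SEA would match acoustical
input rather than already-recognized text.

```python
# Hypothetical table mapping sound commands to execution commands.
# The command names on the right are illustrative assumptions only.
SPEECH_MENU = {
    "turn pc off": "PC_POWER_OFF",
    "turn stereo off": "STEREO_POWER_OFF",
}


def match_sound_command(recognized_text, menu=SPEECH_MENU):
    """Return the execution command stored for a sound command, or None."""
    # Normalize case and whitespace before the table lookup.
    key = " ".join(recognized_text.lower().split())
    return menu.get(key)
```

A sound command that is absent from the table simply yields no
execution command in this sketch.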

There are several standards for constructing the speech
menu of SEA. For example, Microsoft® Speech API ("Application
Program Interface" or "SAPI") (Microsoft Corporation, Redmond,
Washington) and Novel® Speech Recognition API ("SRAPI") (Novel
Corporation, Ottawa, Canada) are two common standards for
constructing the speech menu.

Conventionally, the speech menus are professionally
created by independent software vendors ("ISVs") and they are
typically static (i.e., they cannot be adjusted by the user;
only the ISVs can modify them). A problem arises when the
user is attempting to use simultaneously a number of SEAs that
have different associated speech menus (e.g., these speech
menus are not inter-operable). A conflict may arise between
SEAs attempting to share different speech menus. This problem
occurs due to the nature of SEA and the speech menu, and their
ability to distinguish one word, phrase or tone accurately
from another. Thus, there is a need for improved
inter-operability between different SEAs.

SUMMARY OF THE INVENTION 

An embodiment of the present invention provides for a
method and apparatus for developing a speech menu which is
adapted to store a plurality of sound commands for a
speech-enabled application. A first sound command of the plurality
of sound commands is compared to a second sound command to
determine an accuracy value. If the accuracy value is less
than a predetermined value, then at least one of the first
sound command and the second sound command is replaced with a
third sound command.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows an exemplary embodiment of an apparatus
including a distance accuracy application according to an
embodiment of the present invention.

Figure 2a shows a first exemplary speech menu having two
recognizable sound commands.

Figure 2b shows a second exemplary speech menu having one
recognizable sound command.

Figure 3 shows a third speech menu which combines the first
and second menus of Figures 2a-2b.

Figure 4 shows an exemplary flow chart of the distance
accuracy application according to an exemplary embodiment of
the present invention.

Figure 5 shows a final speech menu after an exemplary
operation of the distance accuracy application.

DETAILED DESCRIPTION OF THE INVENTION

According to an exemplary embodiment of the present 
invention, diverse, independent SEAs are enabled to share 
efficiently a common speech menu. Such sharing (discussed in 
detail below) is achieved by measuring the quality of each 
sound command in speech menus and by using a quality metric 
procedure to determine an acceptable sound command. For 
instance, the sound command may include (1) a vocal command
issued by, e.g., a human or robot, or (2) a tone command
issued by, e.g., a tone-producing apparatus, such as a telephone.

The quality of the sound command is determined by 
analyzing the likelihood that one word, phrase or tone (e.g.,
the sound command) of the speech menu will be incorrectly
interpreted as another word, phrase or tone of the speech
menu. The method according to an exemplary embodiment of the
present invention provides a distance accuracy application
that determines the optimal sound command based on other
active SEAs.

Figure 1 shows an apparatus 100 (e.g., a computer
including a processor executing code, such as a Pentium II®
processor, Intel Corporation, Santa Clara, California)
executing a Speech-Enabled Application 110 ("SEA") and a
Distance Accuracy Application 120 ("DAA"). Although DAA 120 of
Figure 1 can work with different standards, in this exemplary 
embodiment, DAA 120 is being used in a SAPI implementation. 
SEA 110 and DAA 120 may be stored in, e.g., a memory 
arrangement, a processor, a microphone, and a speaker. 

Apparatus 100 can be coupled to at least one electronic 
device using, e.g., a serial connection, a parallel 
connection, a dedicated card connection, an internet 
connection, a wireless connection, etc. Execution of SEA 110 
by apparatus 100 controls a first electronic device (e.g., a 
personal computer ("PC")) and a second electronic device 
(e.g., a stereo). As shown in Figure 2a, SEA 110 stores a
first speech menu which has a top level. The top level allows
the user to issue sound commands, e.g., a first sound command
(e.g., "Turn PC off") or a second sound command (e.g., "Turn
stereo off").

The user may also connect a further electronic device to
apparatus 100 (e.g., a television set). The further
electronic device can also be controlled by SEA 110. As shown
in Figure 2b, the further electronic device can be controlled
by a third sound command (e.g., "Turn TV off"). It should be
noted that the present invention does not impose limitations
on how many SEA devices can be connected to apparatus 100 and
on how many SEA devices can be controlled by SEA 110. The SEA
device may be a computer, a stereo system, a telephone, a
video cassette recorder, a home appliance control device, a
cordless computer access device, a lighting system, or any
other suitable apparatus.

As shown in Figure 3, a second speech menu includes, e.g.,
the first, second and third sound commands. As shall be
described below, a vocal pronunciation of the exemplary first
sound command and the exemplary second sound command is
sufficiently distinctive. As such, SEA 110 has a high
probability of differentiating a correct request. Such
probability is determined using, for example, a conventional
method of acoustical pattern matching ("APM").

An APM method compares the acoustical patterns of at
least two sound commands provided thereto and determines an
accuracy value (i.e., an indicator of how accurately SEA
110 can differentiate between the two sound commands). The
accuracy value may range between, e.g., "0" and "1", where "1"
represents the best possible accuracy and "0" represents the
worst possible accuracy (e.g., duplicate phrases or tones).
For example, the speech menu having words "should" and "could"
would have a low accuracy value since acoustical patterns of
these two exemplary words are very similar. On the other
hand, words such as "shall" and "may" would have a high accuracy
value since the acoustic patterns of these words are
distinguishable. Here, the second sound command and the third
sound command have similar acoustical patterns, thus the
accuracy of SEA 110 is reduced.
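
The "0" to "1" accuracy scale can be illustrated with a rough
sketch. It is an assumption made purely for illustration that
spelling similarity stands in for acoustical similarity; the
sketch uses the Python standard library class
difflib.SequenceMatcher and is not an APM method. The function
name accuracy_value is hypothetical.

```python
from difflib import SequenceMatcher


def accuracy_value(command_a, command_b):
    """Return a value in [0, 1]; 0 means the two commands are duplicates.

    Assumption: text similarity is used as a crude stand-in for the
    acoustical pattern comparison that an APM method would perform.
    """
    similarity = SequenceMatcher(None, command_a.lower(), command_b.lower()).ratio()
    return 1.0 - similarity
```

Under this stand-in, "should" and "could" receive a lower
accuracy value than "shall" and "may", and duplicate phrases
receive the value 0.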

Figure 4 shows a flow chart illustrating an exemplary
operation of DAA 120 according to an embodiment of the present
invention. In step 200, DAA 120 checks the first speech menu
to determine a first accuracy value between the first and
second sound commands. The first accuracy value may be, for
example, 0.7 if the first sound command and the second sound
command do not have similar acoustical patterns.

In an alternative exemplary embodiment of the present 
invention, DAA 120 may also begin its operation with step 210 
without any sound commands existing in the first speech menu.

After the third sound command is added to create the
second speech menu (step 210), a second accuracy value is
determined by analyzing acoustical patterns of the first and
third sound commands against acoustical pattern(s) of the
second speech menu. The second accuracy value may be, e.g.,
0.15 because "Turn off PC" for the first sound command and
"Turn off TV" for the third sound command have similar
acoustical patterns. (See step 220). In step 230, DAA 120
compares the second accuracy value to a standard (i.e.,
threshold) accuracy value. The standard accuracy value may be
determined as a function of the first accuracy value, an
average accuracy value and/or a predetermined accuracy value.
The average accuracy value is determined based on an average of
prior accuracy values of DAA 120.
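
One possible way of deriving the standard accuracy value from
these inputs is sketched below. The description does not fix a
particular function, so the choice made here (the mean of
whichever inputs are supplied) is an arbitrary assumption, and
the name standard_accuracy_value and its parameters are
hypothetical.

```python
from statistics import mean


def standard_accuracy_value(first_accuracy=None, prior_accuracies=(), predetermined=None):
    """Combine the available inputs into one threshold value.

    Assumption: the inputs are combined by a plain mean; any other
    function of these inputs would fit the description equally well.
    """
    candidates = []
    if first_accuracy is not None:
        candidates.append(first_accuracy)
    if prior_accuracies:
        # Average accuracy value: average of prior accuracy values.
        candidates.append(mean(prior_accuracies))
    if predetermined is not None:
        candidates.append(predetermined)
    if not candidates:
        raise ValueError("at least one input is required")
    return mean(candidates)
```

With a first accuracy value of 0.7, a second accuracy value of
0.15 would fall below any threshold derived from that value
alone.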

If the second accuracy value is less than the standard
accuracy value, then at least one sound command which causes
the second accuracy value to be less than the standard
accuracy value is replaced. For instance, DAA 120 may replace
the third sound command with another sound command which
provides similar meaning but has a different acoustical
pattern. (See step 240). The third sound command may be
replaced with a fourth sound command (e.g., "Turn Television
off"). In another embodiment of the present invention, the
user may be asked to choose which sound command would be
replaced (e.g., whether to replace the first sound command
and/or the third sound command).

Once again, the second accuracy value is determined for
the second speech menu which now includes the fourth sound
command. (See step 220). DAA 120 replaces one of the sound
commands of the second speech menu until the second accuracy
value is greater than or equal to the standard accuracy value.

When the second accuracy value is greater than or equal
to the standard accuracy value, then the second speech menu
becomes a final speech menu. (See step 250). As shown in
Figure 5, the final speech menu now includes the first sound
command and the fourth sound command which have sufficiently
dissimilar acoustical patterns (i.e., synonyms). The third
sound command is not a part of the final speech menu.
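
The loop through steps 210 to 250 can be sketched as follows.
This is a simplified illustration under stated assumptions:
spelling similarity (difflib.SequenceMatcher) stands in for
acoustical pattern matching, only the newly added sound command
is ever replaced, and the list of alternative phrasings is
supplied by the caller. The names accuracy_value, menu_accuracy
and build_final_menu are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def accuracy_value(command_a, command_b):
    """Text-based stand-in for an APM accuracy value (0 = duplicates)."""
    return 1.0 - SequenceMatcher(None, command_a.lower(), command_b.lower()).ratio()


def menu_accuracy(menu):
    """Lowest pairwise accuracy in the menu; 1.0 if fewer than two commands."""
    return min((accuracy_value(a, b) for a, b in combinations(menu, 2)), default=1.0)


def build_final_menu(menu, new_command, alternatives, threshold):
    """Add a command, replacing it with alternatives until the threshold is met."""
    for candidate in [new_command, *alternatives]:
        trial_menu = [*menu, candidate]              # step 210, or step 240 on retry
        if menu_accuracy(trial_menu) >= threshold:   # steps 220 and 230
            return trial_menu                        # step 250: final speech menu
    raise ValueError("no candidate meets the standard accuracy value")
```

For example, calling build_final_menu(["Turn PC off", "Turn
stereo off"], "Turn TV off", ["Turn Television off"], 0.2)
rejects "Turn TV off" and accepts "Turn Television off". The
threshold of 0.2 is chosen to suit the text-based stand-in and
does not correspond to the exemplary values given above.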

Accurate and usable SEA 110 with DAA 120, provided by the
method and apparatus according to an embodiment of the present 
invention, can be utilized advantageously for Cordless PC 
Access Capability Devices. A full range of such devices can 
use the sound command as a primary means of communication with 
the user. 

The method and apparatus according to an embodiment of
the present invention also enable a broad range of SEAs
to coexist on the same user interface (i.e., the top level or
any other level), thus allowing the ISVs to construct SEAs
which use speech-centric devices, such as, e.g., Cordless PC
Access Capability Devices. The ISVs are capable of executing
the SEA for the electronic device without conflicting with
other installed applications. Thus, a broad range of SEAs may
co-exist on the same interface. For example, the user may be
able to control a home automation function, an electronic
device and a PC-assisted telephone from the same device.

Another advantage of an embodiment of the present 
invention is that inter-operability between SEAs written by 
different ISVs is increased. Applications can be written by 
the ISVs that dynamically choose top level menu items 
depending on which items already exist. Without utilizing the 
present invention, one must manually construct a top level 
speech manager in order to prevent conflicts between SEAs. 

Yet another advantage of the method and apparatus
according to an embodiment of the present invention is that it
allows SEAs to dynamically create the most appropriate sound
commands and allows applications to support a wide variety of
phrases based on the environment in which they are installed.

A further advantage of an embodiment of the present
invention is that it may be used by sound-centric applications
to construct and present menus to the user. The SEA may
utilize this embodiment to determine the quality of the local
menus and the effects of how their menus would perform when
combined with other menus, for example, from the ISVs.

Another advantage of an embodiment of the present
invention is that the users' perception of the accuracy of SEA
is increased. Choosing the correct sound commands has a
positive effect on the user regarding the accuracy of SEA. By
applying DAA to the speech menus, the ISVs can easily choose
sound commands that are more likely to provide accurate speech
recognition.

Several embodiments of the present invention are
specifically illustrated and/or described herein. However, it
will be appreciated that modifications and variations of the
present invention are covered by the above teachings and
within the purview of the appended claims without departing
from the spirit and intended scope of the present invention.

WHAT IS CLAIMED IS:



1. A method for developing a speech menu which is adapted to
store a plurality of sound commands for a speech-enabled
application, comprising the steps of:

a) comparing a first sound command of the plurality of 
sound commands to a second sound command to determine an 
accuracy value; and 

b) if the accuracy value is less than a predetermined 
value, replacing at least one of the first sound command and 
the second sound command with a third sound command. 

2. The method according to claim 1, further comprising the
step of:

c) before step (a), adding the second sound command to
the speech menu.

3. The method according to claim 1, further comprising the
step of:

d) before step (b), determining the predetermined value
as a function of at least one of the accuracy value, a 
predetermined threshold value and an average accuracy value, 
the average accuracy value being determined as a function of 
an average of a plurality of prior accuracy values. 

4. The method according to claim 1, wherein step (a) 
includes the substep of:

determining the accuracy value using an acoustical 
pattern matching procedure. 

5. The method according to claim 1, wherein at least one of 
the first and second sound commands and the third sound 
command are synonyms.

6. The method according to claim 1, further comprising the 
step of: 



[2207/6002] 



e) repeating steps (a)-(b) for each sound command of
the speech menu. 

7. The method according to claim 9, wherein each of the 
plurality of sound commands includes at least one of a 
word, a phrase and at least one tone. 

8. A speech-enabled apparatus for developing a speech menu
which is adapted to store a plurality of sound commands
for a speech-enabled application, comprising:

a distance accuracy module capable of comparing a first 
sound command of the plurality of sound commands to a second 
sound command in the speech menu to determine an accuracy 
value, the distance accuracy module capable of replacing at 
least one of the first sound command and the second sound 
command with a third sound command if the accuracy value is 
less than a predetermined value. 

9. The speech-enabled apparatus according to claim 8,
wherein the speech-enabled apparatus includes a computer.

10. The speech-enabled apparatus according to claim 8, 
wherein the speech-enabled apparatus is coupled to at
least one device using at least one of a serial
connection, a parallel connection, a dedicated card
connection, an internet connection and a wireless
connection.

11. The speech-enabled apparatus according to claim 10, 
wherein the at least one device includes at least one of 
a computer, a stereo system, a telephone, a VCR, a home 
appliance control device, a cordless computer access 
device and a lighting system. 

12. The speech-enabled apparatus according to claim 8, 
wherein each of the plurality of sound commands includes 
at least one of a word, a phrase and at least one tone. 






13. A set of instructions residing in a storage medium, the
set of instructions capable of being executed by a 
processor to implement a development of a speech menu, 
the speech menu is adapted to store a plurality of sound 
commands for a speech-enabled application, the method 
comprising the steps of: 

a) comparing a first sound command of the plurality of 
sound commands to a second sound command to determine an 
accuracy value; and 

b) if the accuracy value is less than a predetermined 
value, replacing at least one of the first sound command and 
the second sound command with a third sound command. 

14. The set of instructions according to claim 13, wherein 
the method further comprising the step of: 

c) before step (a), adding the second sound command to
the speech menu. 

15. The set of instructions according to claim 13, wherein 
the method further comprising the step of:

d) before step (b), determining the predetermined value
as a function of at least one of the accuracy value, a 
predetermined threshold value and an average accuracy value, 
the average accuracy value being determined as a function of 
an average of a plurality of prior accuracy values. 

16. The set of instructions according to claim 13, wherein 
the step (a) of the method includes the substep of:

determining the accuracy value using an acoustical
pattern matching procedure.

17. The set of instructions according to claim 13, wherein at 
least one of the first and second sound commands and the 
third sound command are synonyms.

18. The set of instructions according to claim 13, wherein
the method further comprising the step of:

e) repeating steps (a)-(b) for each sound command of
the speech menu. 

19. The set of instructions according to claim 13, wherein 
each of the plurality of sound commands includes at least 
one of a word, a phrase and at least one tone. 

20. A computer data signal embodied in a carrier wave to 
develop a speech menu, the speech menu being adapted to 
store a plurality of sound commands for a speech-enabled 
application, the computer data signal comprising: 

a) a comparison source code segment comparing a first 
sound command of the plurality of sound commands to a second 
sound command to determine an accuracy value; and 

b) a replacing source code segment replacing at least one 
of the first sound command and the second sound command with a 
third sound command if the accuracy value is less than a 
predetermined value.

ABSTRACT



Method and apparatus are provided for developing a speech 
menu which is adapted to store a plurality of sound commands 
for a speech-enabled application. A first sound command of 
the plurality of sound commands is compared to a second sound 
command to determine an accuracy value. If the accuracy value 
is less than a predetermined value, then at least one of the 
first sound command and the second sound command is replaced 
with a third sound command. 

NY3-2531-5 

Drawing sheets (labels recovered from the figures):

Apparatus 100: Application ("SEA")

Speech menu: First Sound Command (e.g., "Turn PC off");
Second Sound Command (e.g., "Turn stereo off");
Third Sound Command (e.g., "Turn TV off")

Speech menu: First Sound Command (e.g., "Turn PC off");
Second Sound Command (e.g., "Turn stereo off");
Third Sound Command (e.g., "Turn TV off")

Figure 5: Second Sound Command (e.g., "Turn stereo off")